---
title: Accessing an SQL Database with PXF (JDBC)
---

<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.
-->

Some of your data may already reside in an external SQL database. PXF provides access to this data via the PXF JDBC connector. The JDBC connector is a JDBC client. It can read data from and write data to SQL databases including MySQL, ORACLE, Microsoft SQL Server, DB2, PostgreSQL, Hive, and Apache Ignite.

This section describes how to use the PXF JDBC connector to access data in an external SQL database, including how to create and query or insert data into a PXF external table that references a table in an external database.

<div class="note">The JDBC connector does not guarantee consistency when writing to an external SQL database. Be aware that if an <code>INSERT</code> operation fails, some data may be written to the external database table. If you require consistency for writes, consider writing to a staging table in the external database, and loading to the target table only after verifying the write operation.</div>

## <a id="prereq"></a>Prerequisites

Before you access an external SQL database using the PXF JDBC connector, ensure that:

- You can identify the PXF runtime configuration directory (`$PXF_BASE`).
- You have configured PXF, and PXF is running on each Greenplum Database segment host. See [Configuring PXF](instcfg_pxf.html) for additional information.
- Connectivity exists between all Greenplum Database segment hosts and the external SQL database.
- You have configured your external SQL database for user access from all Greenplum Database segment hosts.
- You have registered any JDBC driver JAR dependencies.
- (Recommended) You have created one or more named PXF JDBC connector server configurations as described in [Configuring the PXF JDBC Connector](jdbc_cfg.html).

## <a id="datatypes"></a>Data Types Supported

The PXF JDBC connector supports the following data types:

- INTEGER, BIGINT, SMALLINT
- REAL, FLOAT8
- NUMERIC
- BOOLEAN
- VARCHAR, BPCHAR, TEXT
- DATE
- TIMESTAMP
- BYTEA

Any data type not listed above is not supported by the PXF JDBC connector.

**Note**: The JDBC connector does not support reading or writing Hive data stored as a byte array (`byte[]`).

## <a id="queryextdata"></a>Accessing an External SQL Database
The PXF JDBC connector supports a single profile named `jdbc`. You can both read data from and write data to an external SQL database table with this profile. You can also use the connector to run a static, named query in external SQL database and read the results.

To access data in a remote SQL database, you create a readable or writable Greenplum Database external table that references the remote database table. The Greenplum Database external table and the remote database table or query result tuple must have the same definition; the column names and types must match.

Use the following syntax to create a Greenplum Database external table that references a remote SQL database table or a query result from the remote database:

<pre>
CREATE [READABLE | WRITABLE] EXTERNAL TABLE &lt;table_name>
    ( &lt;column_name> &lt;data_type> [, ...] | LIKE &lt;other_table> )
LOCATION ('pxf://&lt;external-table-name>|query:&lt;query_name>?<b>PROFILE=jdbc[&SERVER=&lt;server_name>]</b>[&&lt;custom-option>=&lt;value>[...]]')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import'|'pxfwritable_export');
</pre>


The specific keywords and values used in the Greenplum Database [CREATE EXTERNAL TABLE](https://gpdb.docs.pivotal.io/latest/ref_guide/sql_commands/CREATE_EXTERNAL_TABLE.html) command are described in the table below.

| Keyword  | Value |
|-------|-------------------------------------|
| \<external&#8209;table&#8209;name\>    | The full name of the external table. Depends on the external SQL database, may include a schema name and a table name. |
| query:\<query_name\>    | The name of the query to execute in the remote SQL database. |
| PROFILE    | The `PROFILE` keyword value must specify `jdbc`. |
| SERVER=\<server_name\>   | The named server configuration that PXF uses to access the data. PXF uses the `default` server if not specified. |
| \<custom&#8209;option\>=\<value\>  | \<custom-option\> is profile-specific. `jdbc` profile-specific options are discussed in the next section.|
| FORMAT 'CUSTOM' | The JDBC `CUSTOM` `FORMAT` supports the built-in `'pxfwritable_import'` `FORMATTER` function for read operations and the built-in `'pxfwritable_export'` function for write operations. |

**Note**: You cannot use the `HEADER` option in your `FORMAT` specification when you create a PXF external table.


### <a id="jdbcoptions"></a>JDBC Custom Options

You include JDBC connector custom options in the `LOCATION` URI, prefacing each option with an ampersand `&`. `CREATE EXTERNAL TABLE` \<custom-option\>s supported by the `jdbc` profile include:

| Option Name   | Operation | Description
|---------------|------------|--------|
| BATCH_SIZE | Write | Integer that identifies the number of `INSERT` operations to batch to the external SQL database. Write batching is enabled by default; the default value is 100. |
| FETCH_SIZE | Read | Integer that identifies the number of rows to buffer when reading from an external SQL database. Read row batching is enabled by default; the default read fetch size is 1000. |
| QUERY_TIMEOUT | Read/Write | Integer that identifies the amount of time (in seconds) that the JDBC driver waits for a statement to execute. The default wait time is infinite. |
| POOL_SIZE | Write | Enable thread pooling on `INSERT` operations and identify the number of threads in the pool. Thread pooling is disabled by default. |
| PARTITION_BY | Read | Enables read partitioning. The partition column, \<column-name\>:\<column-type\>. You may specify only one partition column. The JDBC connector supports `date`, `int`, and `enum` \<column-type\> values, where `int` represents any JDBC integral type. If you do not identify a `PARTITION_BY` column, a single PXF instance services the read request. |
| RANGE | Read | Required when `PARTITION_BY` is specified. The query range; used as a hint to aid the creation of partitions. The `RANGE` format is dependent upon the data type of the partition column. When the partition column is an `enum` type, `RANGE` must specify a list of values, \<value\>:\<value\>[:\<value\>[...]], each of which forms its own fragment. If the partition column is an `int` or `date` type, `RANGE` must specify \<start-value\>:\<end-value\> and represents the interval from \<start-value\> through \<end-value\>, inclusive. The `RANGE` for an `int` partition column may span any 64-bit signed integer values. If the partition column is a `date` type, use the `yyyy-MM-dd` date format. |
| INTERVAL | Read | Required when `PARTITION_BY` is specified and of the `int`, `bigint`, or `date` type. The interval, \<interval-value\>[:\<interval-unit\>], of one fragment. Used with `RANGE` as a hint to aid the creation of partitions. Specify the size of the fragment in \<interval-value\>. If the partition column is a `date` type, use the \<interval-unit\> to specify `year`, `month`, or `day`. PXF ignores `INTERVAL` when the `PARTITION_BY` column is of the `enum` type. |
| QUOTE_COLUMNS | Read | Controls whether PXF should quote column names when constructing an SQL query to the external database. Specify `true` to force PXF to quote all column names; PXF does not quote column names if any other value is provided. If `QUOTE_COLUMNS` is not specified (the default), PXF automatically quotes *all* column names in the query when *any* column name:<br>- includes special characters, or <br>- is mixed case and the external database does not support unquoted mixed case identifiers. |


#### <a id="batching"></a>Batching Insert Operations (Write)

*When the JDBC driver of the external SQL database supports it*, batching of `INSERT` operations may significantly increase performance.

Write batching is enabled by default, and the default batch size is 100. To disable batching or to modify the default batch size value, create the PXF external table with a `BATCH_SIZE` setting:

- `BATCH_SIZE=0` or `BATCH_SIZE=1` - disables batching
- `BATCH_SIZE=(n>1)` - sets the `BATCH_SIZE` to `n`

When the external database JDBC driver does not support batching, the behaviour of the PXF JDBC connector depends on the `BATCH_SIZE` setting as follows:

- `BATCH_SIZE` omitted - The JDBC connector inserts without batching.
- `BATCH_SIZE=(n>1)` - The `INSERT` operation fails and the connector returns an error.

#### <a id="fetching"></a>Batching on Read Operations

By default, the PXF JDBC connector automatically batches the rows it fetches from an external database table. The default row fetch size is 1000. To modify the default fetch size value, specify a `FETCH_SIZE` when you create the PXF external table. For example:

``` pre
FETCH_SIZE=5000
```

If the external database JDBC driver does not support batching on read, you must explicitly disable read row batching by setting `FETCH_SIZE=0`.

#### <a id="threadpool"></a>Thread Pooling (Write)

The PXF JDBC connector can further increase write performance by processing `INSERT` operations in multiple threads when threading is supported by the JDBC driver of the external SQL database.

Consider using batching together with a thread pool. When used together, each thread receives and processes one complete batch of data. If you use a thread pool without batching, each thread in the pool receives exactly one tuple.

The JDBC connector returns an error when any thread in the thread pool fails. Be aware that if an `INSERT` operation fails, some data may be written to the external database table.

To disable or enable a thread pool and set the pool size, create the PXF external table with a `POOL_SIZE` setting as follows:

- `POOL_SIZE=(n<1)` - thread pool size is the number of CPUs in the system
- `POOL_SIZE=1` - disable thread pooling
- `POOL_SIZE=(n>1)`- set the `POOL_SIZE` to `n`

#### <a id="partitioning"></a>Partitioning (Read)

The PXF JDBC connector supports simultaneous read access from PXF instances running on multiple segment hosts to an external SQL table. This feature is referred to as partitioning. Read partitioning is not enabled by default. To enable read partitioning, set the `PARTITION_BY`, `RANGE`, and `INTERVAL` custom options when you create the PXF external table.

PXF uses the `RANGE` and `INTERVAL` values and the `PARTITON_BY` column that you specify to assign specific data rows in the external table to PXF instances running on the Greenplum Database segment hosts. This column selection is specific to PXF processing, and has no relationship to a partition column that you may have specified for the table in the external SQL database.


Example JDBC \<custom-option\> substrings that identify partitioning parameters:

``` pre
&PARTITION_BY=id:int&RANGE=1:100&INTERVAL=5
&PARTITION_BY=year:int&RANGE=2011:2013&INTERVAL=1
&PARTITION_BY=createdate:date&RANGE=2013-01-01:2016-01-01&INTERVAL=1:month
&PARTITION_BY=color:enum&RANGE=red:yellow:blue
```

When you enable partitioning, the PXF JDBC connector splits a `SELECT` query into multiple subqueries that retrieve a subset of the data, each of which is called a fragment. The JDBC connector automatically adds extra query constraints (`WHERE` expressions) to each fragment to guarantee that every tuple of data is retrieved from the external database exactly once.

For example, when a user queries a PXF external table created with a `LOCATION` clause that specifies `&PARTITION_BY=id:int&RANGE=1:5&INTERVAL=2`, PXF generates 5 fragments: two according to the partition settings and up to three implicitly generated fragments. The constraints associated with each fragment are as follows:

- Fragment 1: WHERE (id < 1) - implicitly-generated fragment for RANGE start-bounded interval
- Fragment 2: WHERE (id >= 1) AND (id < 3) - fragment specified by partition settings
- Fragment 3: WHERE (id >= 3) AND (id < 5) - fragment specified by partition settings
- Fragment 4: WHERE (id >= 5) - implicitly-generated fragment for RANGE end-bounded interval
- Fragment 5: WHERE (id IS NULL) - implicitly-generated fragment

PXF distributes the fragments among Greenplum Database segments. A PXF instance running on a segment host spawns a thread for each segment on that host that services a fragment. If the number of fragments is less than or equal to the number of Greenplum segments configured on a segment host, a single PXF instance may service all of the fragments. Each PXF instance sends its results back to Greenplum Database, where they are collected and returned to the user.

When you specify the `PARTITION_BY` option, tune the `INTERVAL` value and unit based upon the optimal number of JDBC connections to the target database and the optimal distribution of external data across Greenplum Database segments. The `INTERVAL` low boundary is driven by the number of Greenplum Database segments while the high boundary is driven by the acceptable number of JDBC connections to the target database. The `INTERVAL` setting influences the number of fragments, and should ideally not be set too high nor too low. Testing with multiple values may help you select the optimal settings. 


### <a id="jdbc_example_postgresql"></a>Example: Reading From and Writing to a PostgreSQL Table

In this example, you:

- Create a PostgreSQL database and table, and insert data into the table
- Create a PostgreSQL user and assign all privileges on the table to the user
- Configure the PXF JDBC connector to access the PostgreSQL database
- Create a PXF readable external table that references the PostgreSQL table
- Read the data in the PostgreSQL table
- Create a PXF writable external table that references the PostgreSQL table
- Write data to the PostgreSQL table
- Read the data in the PostgreSQL table again

#### <a id="ex_create_pgtbl"></a>Create a PostgreSQL Table

Perform the following steps to create a PostgreSQL table named `forpxf_table1` in the `public` schema of a database named `pgtestdb`, and grant a user named `pxfuser1` all privileges on this table:

1. Identify the host name and port of your PostgreSQL server.

2. Connect to the default PostgreSQL database as the `postgres` user. For example, if your PostgreSQL server is running on the default port on the host named `pserver`:

    ``` shell
    $ psql -U postgres -h pserver
    ```

3. Create a PostgreSQL database named `pgtestdb` and connect to this database:

    ``` sql
    =# CREATE DATABASE pgtestdb;
    =# \connect pgtestdb;
    ```

4. Create a table named `forpxf_table1` and insert some data into this table:

    ``` sql
    =# CREATE TABLE forpxf_table1(id int);
    =# INSERT INTO forpxf_table1 VALUES (1);
    =# INSERT INTO forpxf_table1 VALUES (2);
    =# INSERT INTO forpxf_table1 VALUES (3);
    ```

5. Create a PostgreSQL user named `pxfuser1`:

    ``` sql
    =# CREATE USER pxfuser1 WITH PASSWORD 'changeme';
    ```
 
6. Assign user `pxfuser1` all privileges on table `forpxf_table1`, and exit the `psql` subsystem:

    ``` sql
    =# GRANT ALL ON forpxf_table1 TO pxfuser1;
    =# \q
    ```

    With these privileges, `pxfuser1` can read from and write to the `forpxf_table1` table.

7. Update the PostgreSQL configuration to allow user `pxfuser1` to access `pgtestdb` from each Greenplum Database segment host. This configuration is specific to your PostgreSQL environment. You will update the `/var/lib/pgsql/pg_hba.conf` file and then restart the PostgreSQL server.


#### <a id="ex_jdbconfig"></a>Configure the JDBC Connector 

You must create a JDBC server configuration for PostgreSQL, download the PostgreSQL driver JAR file to your system, copy the JAR file to the PXF user configuration directory, synchronize the PXF configuration, and then restart PXF.

This procedure will typically be performed by the Greenplum Database administrator.

1. Log in to the Greenplum Database master node:

    ``` shell
    $ ssh gpadmin@<gpmaster>
    ```

2. Create a JDBC server configuration for PostgreSQL as described in [Example Configuration Procedure](jdbc_cfg.html#cfg_proc), naming the server/directory `pgsrvcfg`. The `jdbc-site.xml` file contents should look similar to the following (substitute your PostgreSQL host system for `pgserverhost`):

    ``` xml
    <?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>jdbc.driver</name>
        <value>org.postgresql.Driver</value>
    </property>
    <property>
        <name>jdbc.url</name>
        <value>jdbc:postgresql://pgserverhost:5432/pgtestdb</value>
    </property>
    <property>
        <name>jdbc.user</name>
        <value>pxfuser1</value>
    </property>
    <property>
        <name>jdbc.password</name>
        <value>changeme</value>
    </property>
</configuration>
    ```

3. Synchronize the PXF server configuration to the Greenplum Database cluster:

    ``` shell
    gpadmin@gpmaster$ pxf cluster sync
    ``` 

#### <a id="ex_readjdbc"></a>Read from the PostgreSQL Table

Perform the following procedure to create a PXF external table that references the `forpxf_table1` PostgreSQL table that you created in the previous section, and read the data in the table:

1. Create the PXF external table specifying the `jdbc` profile. For example:

    ``` sql
    gpadmin=# CREATE EXTERNAL TABLE pxf_tblfrompg(id int)
                LOCATION ('pxf://public.forpxf_table1?PROFILE=jdbc&SERVER=pgsrvcfg')
                FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
    ```

2. Display all rows of the `pxf_tblfrompg` table:

    ``` sql
    gpadmin=# SELECT * FROM pxf_tblfrompg;
     id
    ----
      1
      2
      3
    (3 rows)
    ```

#### <a id="ex_writejdbc"></a>Write to the PostgreSQL Table

Perform the following procedure to insert some data into the `forpxf_table1` Postgres table and then read from the table. You must create a new external table for the write operation.

1. Create a writable PXF external table specifying the `jdbc` profile. For example:

    ``` sql
    gpadmin=# CREATE WRITABLE EXTERNAL TABLE pxf_writeto_postgres(id int)
                LOCATION ('pxf://public.forpxf_table1?PROFILE=jdbc&SERVER=pgsrvcfg')
              FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');
    ```

4. Insert some data into the `pxf_writeto_postgres` table. For example:

    ``` sql
    =# INSERT INTO pxf_writeto_postgres VALUES (111);
    =# INSERT INTO pxf_writeto_postgres VALUES (222);
    =# INSERT INTO pxf_writeto_postgres VALUES (333);
    ```

5. Use the `pxf_tblfrompg` readable external table that you created in the previous section to view the new data in the `forpxf_table1` PostgreSQL table:

    ``` sql
    gpadmin=# SELECT * FROM pxf_tblfrompg ORDER BY id DESC;
     id
    -----
     333
     222
     111
       3
       2
       1
    (6 rows)
    ```

## <a id="about_nq"></a>About Using Named Queries

The PXF JDBC Connector allows you to specify a statically-defined query to run against the remote SQL database. Consider using a *named query* when:

- You need to join several tables that all reside in the same external database.
- You want to perform complex aggregation closer to the data source.
- You would use, but are not allowed to create, a `VIEW` in the external database.
- You would rather consume computational resources in the external system to minimize utilization of Greenplum Database resources.
- You want to run a HIVE query and control resource utilization via YARN.

The Greenplum Database administrator defines a query and provides you with the query name to use when you create the external table. Instead of a table name, you specify `query:<query_name>` in the `CREATE EXTERNAL TABLE` `LOCATION` clause to instruct the PXF JDBC connector to run the static query named `<query_name>` in the remote SQL database.

PXF supports named queries only with readable external tables. You must create a unique Greenplum Database readable external table for each query that you want to run.

The names and types of the external table columns must exactly match the names, types, and order of the columns return by the query result. If the query returns the results of an aggregation or other function, be sure to use the `AS` qualifier to specify a specific column name.

For example, suppose that you are working with PostgreSQL tables that have the following definitions:

``` sql
CREATE TABLE customers(id int, name text, city text, state text);
CREATE TABLE orders(customer_id int, amount int, month int, year int);
```

And this PostgreSQL query that the administrator named `order_rpt`:

``` sql
SELECT c.name, sum(o.amount) AS total, o.month
  FROM customers c JOIN orders o ON c.id = o.customer_id
  WHERE c.state = 'CO'
GROUP BY c.name, o.month
```

This query returns tuples of type `(name text, total int, month int)`. If the `order_rpt` query is defined for the PXF JDBC server named `pgserver`, you could create a Greenplum Database external table to read these query results as follows:

``` sql
CREATE EXTERNAL TABLE orderrpt_frompg(name text, total int, month int)
  LOCATION ('pxf://query:order_rpt?PROFILE=jdbc&SERVER=pgserver&PARTITION_BY=month:int&RANGE=1:13&INTERVAL=3')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
```

This command references a query named `order_rpt` defined in the `pgserver` server configuration. It also specifies JDBC read partitioning options that provide PXF with the information that it uses to split/partition the query result data across its servers/segments.

The PXF JDBC connector automatically applies column projection and filter pushdown to external tables that reference named queries.


### <a id="jdbc_example_namedquery"></a>Example: Reading the Results of a PostgreSQL Query

In this example, you:

- Use the PostgreSQL database `pgtestdb`, user `pxfuser1`, and PXF JDBC connector server configuration `pgsrvcfg` that you created in [Example: Reading From and Writing to a PostgreSQL Database](#jdbc_example_postgresql).
- Create two PostgreSQL tables and insert data into the tables.
- Assign all privileges on the tables to `pxfuser1`.
- Define a named query that performs a complex SQL statement on the two PostgreSQL tables, and add the query to the `pgsrvcfg` JDBC server configuration.
- Create a PXF readable external table definition that matches the query result tuple and also specifies read partitioning options.
- Read the query results, making use of PXF column projection and filter pushdown.

#### <a id="nq_create_2pgtbl"></a>Create the PostgreSQL Tables and Assign Permissions

Perform the following procedure to create PostgreSQL tables named `customers` and `orders` in the `public` schema of the database named `pgtestdb`, and grant the user named `pxfuser1` all privileges on these tables:

1. Identify the host name and port of your PostgreSQL server.

2. Connect to the `pgtestdb` PostgreSQL database as the `postgres` user. For example, if your PostgreSQL server is running on the default port on the host named `pserver`:

    ``` shell
    $ psql -U postgres -h pserver -d pgtestdb
    ```

3. Create a table named `customers` and insert some data into this table:

    ``` sql
    CREATE TABLE customers(id int, name text, city text, state text);
    INSERT INTO customers VALUES (111, 'Bill', 'Helena', 'MT');
    INSERT INTO customers VALUES (222, 'Mary', 'Athens', 'OH');
    INSERT INTO customers VALUES (333, 'Tom', 'Denver', 'CO');
    INSERT INTO customers VALUES (444, 'Kate', 'Helena', 'MT');
    INSERT INTO customers VALUES (555, 'Harry', 'Columbus', 'OH');
    INSERT INTO customers VALUES (666, 'Kim', 'Denver', 'CO');
    INSERT INTO customers VALUES (777, 'Erik', 'Missoula', 'MT');
    INSERT INTO customers VALUES (888, 'Laura', 'Athens', 'OH');
    INSERT INTO customers VALUES (999, 'Matt', 'Aurora', 'CO');
    ```

4. Create a table named `orders` and insert some data into this table:

    ``` sql
    CREATE TABLE orders(customer_id int, amount int, month int, year int);
    INSERT INTO orders VALUES (111, 12, 12, 2018);
    INSERT INTO orders VALUES (222, 234, 11, 2018);
    INSERT INTO orders VALUES (333, 34, 7, 2018);
    INSERT INTO orders VALUES (444, 456, 111, 2018);
    INSERT INTO orders VALUES (555, 56, 11, 2018);
    INSERT INTO orders VALUES (666, 678, 12, 2018);
    INSERT INTO orders VALUES (777, 12, 9, 2018);
    INSERT INTO orders VALUES (888, 120, 10, 2018);
    INSERT INTO orders VALUES (999, 120, 11, 2018);
    ```

5. Assign user `pxfuser1` all privileges on tables `customers` and `orders`, and then exit the `psql` subsystem:

    ``` sql
    GRANT ALL ON customers TO pxfuser1;
    GRANT ALL ON orders TO pxfuser1;
    \q
    ```

#### <a id="nq_jdbconfig"></a>Configure the Named Query

In this procedure you create a named query text file, add it to the `pgsrvcfg` JDBC server configuration, and synchronize the PXF configuration to the Greenplum Database cluster.

This procedure will typically be performed by the Greenplum Database administrator.

1. Log in to the Greenplum Database master node:

    ``` shell
    $ ssh gpadmin@<gpmaster>
    ```

2. Navigate to the JDBC server configuration directory `pgsrvcfg`. For example:

    ``` shell
    gpadmin@gpmaster$ cd $PXF_BASE/servers/pgsrvcfg
    ```
    
3. Open a query text file named `pg_order_report.sql` in a text editor and copy/paste the following query into the file:

    ``` sql
    SELECT c.name, c.city, sum(o.amount) AS total, o.month
      FROM customers c JOIN orders o ON c.id = o.customer_id
      WHERE c.state = 'CO'
    GROUP BY c.name, c.city, o.month
    ``` 

4. Save the file and exit the editor.

5. Synchronize these changes to the PXF configuration to the Greenplum Database cluster:

    ``` shell
    gpadmin@gpmaster$ pxf cluster sync
    ``` 

#### <a id="nq_readjdbc"></a>Read the Query Results

Perform the following procedure on your Greenplum Database cluster to create a PXF external table that references the query file that you created in the previous section, and then reads the query result data:

1. Create the PXF external table specifying the `jdbc` profile. For example:

    ``` sql
    CREATE EXTERNAL TABLE pxf_queryres_frompg(name text, city text, total int, month int)
      LOCATION ('pxf://query:pg_order_report?PROFILE=jdbc&SERVER=pgsrvcfg&PARTITION_BY=month:int&RANGE=1:13&INTERVAL=3')
    FORMAT 'CUSTOM' (FORMATTER='pxfwritable_import');
    ```

    With this partitioning scheme, PXF will issue 4 queries to the remote SQL database, one query per quarter. Each query will return customer names and the total amount of all of their orders in a given month, aggregated per customer, per month, for each month of the target quarter. Greenplum Database will then combine the data into a single result set for you when you query the external table.

2. Display all rows of the query result:

    ``` sql
    SELECT * FROM pxf_queryres_frompg ORDER BY city, total;

     name |  city  | total | month 
    ------+--------+-------+-------
     Matt | Aurora |   120 |    11
     Tom  | Denver |    34 |     7
     Kim  | Denver |   678 |    12
    (3 rows)
    ```

3. Use column projection to display the order total per city:

    ``` sql
    SELECT city, sum(total) FROM pxf_queryres_frompg GROUP BY city;

      city  | sum 
    --------+-----
     Aurora | 120
     Denver | 712
    (2 rows)
    ```

    When you execute this query, PXF requests and retrieves query results for only the `city` and `total` columns, reducing the amount of data sent back to Greenplum Database.

4. Provide additional filters and aggregations to filter the `total` in PostgreSQL:

    ``` sql
    SELECT city, sum(total) FROM pxf_queryres_frompg 
                WHERE total > 100
                GROUP BY city;

      city  | sum 
    --------+-----
     Denver | 678
     Aurora | 120
    (2 rows)
    ```

    In this example, PXF will add the `WHERE` filter to the subquery. This filter is pushed to and executed on the remote database system, reducing the amount of data that PXF sends back to Greenplum Database. The `GROUP BY` aggregation, however, is not pushed to the remote and is performed by Greenplum.

## <a id="jdbc_override"></a>Overriding the JDBC Server Configuration with DDL

You can override certain properties in a JDBC server configuration for a specific external database table by directly specifying the custom option in the `CREATE EXTERNAL TABLE` `LOCATION` clause:

| Custom Option Name   | jdbc-site.xml Property Name |
|----------------------|-----------------------------|
| JDBC_DRIVER | jdbc.driver |
| DB_URL | jdbc.url |
| USER | jdbc.user |
| PASS | jdbc.password |
| BATCH_SIZE | jdbc.statement.batchSize |
| FETCH_SIZE | jdbc.statement.fetchSize |
| QUERY_TIMEOUT | jdbc.statement.queryTimeout |

Example JDBC connection strings specified via custom options:

``` pre
&JDBC_DRIVER=org.postgresql.Driver&DB_URL=jdbc:postgresql://pgserverhost:5432/pgtestdb&USER=pguser1&PASS=changeme
&JDBC_DRIVER=com.mysql.jdbc.Driver&DB_URL=jdbc:mysql://mysqlhost:3306/testdb&USER=user1&PASS=changeme
```
For example:
<pre>CREATE EXTERNAL TABLE pxf_pgtbl(name text, orders int)
  LOCATION ('pxf://public.forpxf_table1?PROFILE=jdbc<b>&JDBC_DRIVER=org.postgresql.Driver&DB_URL=jdbc:postgresql://pgserverhost:5432/pgtestdb&USER=pxfuser1&PASS=changeme</b>')
FORMAT 'CUSTOM' (FORMATTER='pxfwritable_export');</pre>

<div class="note warning">Credentials that you provide in this manner are visible as part of the external table definition. Do not use this method of passing credentials in a production environment.</div>

Refer to [Configuration Property Precedence](cfg_server.html#override) for detailed information about the precedence rules that PXF uses to obtain configuration property settings for a Greenplum Database user.

